
    Bayesian machine learning methods for predicting protein-peptide interactions and detecting mosaic structures in DNA sequences alignments

    Short well-defined domains known as peptide recognition modules (PRMs) regulate many important protein-protein interactions involved in the formation of macromolecular complexes and biochemical pathways. High-throughput experiments such as yeast two-hybrid and phage display are expensive and intrinsically noisy, so it would be desirable to target informative interactions and pursue in silico approaches. We propose a probabilistic discriminative approach for predicting PRM-mediated protein-protein interactions from sequence data. The model suffered from over-fitting, and Laplacian regularisation was found to be important in achieving reasonable generalisation performance. A hybrid approach, in which the binding site motifs were initialised with the predictions of a generative model, yielded the best performance. We also propose another discriminative model which can be applied to all sequences present in the organism at a significantly lower computational cost, owing to its additional assumption that the underlying binding sites tend to be similar. It is difficult to distinguish between the binding site motifs of the PRM because of the small number of instances of each binding site motif. However, closely related species are expected to share similar binding sites, which would be expected to be highly conserved. We investigated rate variation along DNA sequence alignments, modelling confounding effects such as recombination. Traditional approaches to phylogenetic inference assume that a single phylogenetic tree can represent the relationships and divergences between the taxa. However, taxa sequences exhibit varying levels of conservation, e.g. due to regulatory elements and active binding sites, and certain bacteria and viruses undergo interspecific recombination. We propose a phylogenetic factorial hidden Markov model to infer recombination and rate variation.
We examined the performance of our model and inference scheme on various synthetic alignments, and compared it to state-of-the-art breakpoint models. We investigated three DNA sequence alignments: one of maize actin genes, one bacterial (Neisseria), and one of HIV-1. Inference is carried out in the Bayesian framework, using reversible jump Markov chain Monte Carlo.

    Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables

    Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way: after learning the parameters of a graphical model once, new probabilistic queries can be answered at test time without retraining. However, when using undirected PGMs with hidden variables, two sources of error typically compound in all but the simplest models: (a) learning error (computing the partition function and integrating out the hidden variables are both intractable); and (b) prediction error (exact inference is also intractable). Here we introduce query training (QT), a mechanism to learn a PGM that is optimized for the approximate inference algorithm that will be paired with it. The resulting PGM is a worse model of the data (as measured by the likelihood), but it is tuned to produce better marginals for a given inference algorithm. Unlike prior works, our approach preserves the querying flexibility of the original PGM: at test time, we can estimate the marginal of any variable given any partial evidence. We demonstrate experimentally that QT can be used to learn a challenging 8-connected grid Markov random field with hidden variables and that it consistently outperforms the state-of-the-art AdVIL when tested on three undirected models across multiple datasets.

    International Journal of Cancer / Synergistic crosstalk of hedgehog and interleukin-6 signaling drives growth of basal cell carcinoma

    Persistent activation of hedgehog (HH)/GLI signaling accounts for the development of basal cell carcinoma (BCC), a very frequent nonmelanoma skin cancer with rising incidence. Targeting HH/GLI signaling by approved pathway inhibitors can provide significant therapeutic benefit to BCC patients. However, limited response rates, development of drug resistance, and severe side effects of HH pathway inhibitors call for improved treatment strategies such as rational combination therapies simultaneously inhibiting HH/GLI and cooperative signals promoting the oncogenic activity of HH/GLI. In this study, we identified the interleukin-6 (IL6) pathway as a novel synergistic signal promoting oncogenic HH/GLI via STAT3 activation. Mechanistically, we provide evidence that signal integration of IL6 and HH/GLI occurs at the level of cis-regulatory sequences by co-binding of GLI and STAT3 to common HH-IL6 target gene promoters. Genetic inactivation of Il6 signaling in a mouse model of BCC significantly reduced in vivo tumor growth by interfering with HH/GLI-driven BCC proliferation. Our genetic and pharmacologic data suggest that combinatorial HH-IL6 pathway blockade is a promising approach to efficiently arrest cancer growth in BCC patients.

    Segmenting bacterial and viral DNA sequence alignments with a trans-dimensional phylogenetic factorial hidden Markov model

    The traditional approach to phylogenetic inference assumes that a single phylogenetic tree can represent the relationships and divergence between the taxa. However, taxa sequences exhibit varying levels of conservation, e.g. because of regulatory elements and active binding sites. Also, certain bacteria and viruses undergo interspecific recombination, where different strains exchange or transfer DNA subsequences, leading to a tree topology change. We propose a phylogenetic factorial hidden Markov model to detect recombination and rate variation simultaneously. This is applied to two DNA sequence alignments: one bacterial ("Neisseria") and another of type 1 human immunodeficiency virus. Inference is carried out in the Bayesian framework, using reversible jump Markov chain Monte Carlo sampling. Copyright (c) 2009 Royal Statistical Society.
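The idea of segmenting an alignment by a hidden state that changes along the sites can be illustrated with a much simpler model than the one in the abstract. The sketch below is not the authors' method: it swaps the phylogenetic factorial hidden Markov model and reversible jump sampling for a plain two-state HMM decoded with the Viterbi algorithm, and it reduces each alignment column to a single bit (variable or conserved). The function names, the emission probabilities `p_var`, and the self-transition probability `p_stay` are all assumptions chosen for illustration.

```python
import math


def column_is_variable(column):
    """True if the taxa disagree at this alignment column."""
    return len(set(column)) > 1


def viterbi_rate_states(alignment, p_var=(0.1, 0.6), p_stay=0.9):
    """Most likely rate state (0 = slow, 1 = fast) for each site.

    alignment: list of equal-length strings, one per taxon.
    p_var[s]:  assumed probability that a column is variable in state s.
    p_stay:    assumed probability of keeping the same state at the next site.
    """
    n_sites = len(alignment[0])
    n_states = len(p_var)
    obs = [column_is_variable([seq[i] for seq in alignment])
           for i in range(n_sites)]

    def log_emit(state, variable):
        p = p_var[state] if variable else 1.0 - p_var[state]
        return math.log(p)

    def log_trans(a, b):
        p = p_stay if a == b else (1.0 - p_stay) / (n_states - 1)
        return math.log(p)

    # Uniform prior over the initial state.
    score = [math.log(1.0 / n_states) + log_emit(s, obs[0])
             for s in range(n_states)]
    backpointers = []
    for o in obs[1:]:
        new_score, ptr = [], []
        for b in range(n_states):
            best = max(range(n_states),
                       key=lambda a: score[a] + log_trans(a, b))
            ptr.append(best)
            new_score.append(score[best] + log_trans(best, b) + log_emit(b, o))
        score = new_score
        backpointers.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy alignment: ten conserved columns followed by ten variable ones.
    toy = [
        "A" * 10 + "ACGTACGTAC",
        "A" * 10 + "CGTACGTACG",
        "A" * 10 + "GTACGTACGT",
    ]
    print(viterbi_rate_states(toy))
```

A single best path gives no measure of uncertainty about where the breakpoint lies; the Bayesian treatment described in the abstract instead samples over segmentations, and it also models changes in tree topology, which this sketch ignores entirely.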